10 research outputs found

    Towards Efficient Resource Allocation for Embedded Systems

    Get PDF
    The main topic is dynamic resource allocation in embedded systems, especially the allocation of computing time and network traffic on a multiprocessor system on chip (MPSoC). The idea is to dynamically schedule a mobile communication signal processing pipeline on the chip to improve hardware resource efficiency without dramatically increasing resource consumption through dynamic scheduling overhead. Both software and hardware modules are examined for resource consumption hotspots and optimized to remove them. Since signal processing applications can usually be described with static data flow (SDF) graphs, their dynamic scheduling is optimized to improve resource consumption over the commonly used static scheduling approach. A hybrid dynamic scheduler is presented that combines the benefits of process networks and task graph scheduling. It allows the scheduler to optimally balance the parallelization of computation against the added dynamic scheduling overhead. The resulting dynamically created schedule reduces resource consumption by about 50%, with a runtime increase of only 20% compared to a static schedule. Additionally, a distributed dynamic SDF scheduler is proposed that splits the scheduling into different parts, which are then connected into a scheduling pipeline to incorporate multiple processors working in parallel. Each scheduling stage is further reworked into a load-balanced cluster to increase the number of parallel scheduling jobs. In this way, the remaining dynamic scheduling bottleneck of a centralized scheduler is widened, allowing the pipelined, clustered dynamic scheduler to handle 7x more processors for a typical signal processing application.
    The presented dynamic scheduling system assumes three different communication modes between the processing cores. When these are emulated on top of the commonly used remote direct memory access (RDMA) protocol, performance issues arise. RDMA works well for the single-shot point-to-point data transfers used in task graph scheduling, whereas process networks usually rely on high-volume, high-bandwidth data streams. A first-in first-out (FIFO) communication solution is presented that implements a cyclic buffer on both sender and receiver to serve this need. Buffer handling and the data transfer between the buffers are done purely in hardware to remove software overhead from the application. The implementation improves multi-user access to area-efficient single-port on-chip memory modules and achieves 0.8 of the theoretically possible bandwidth, which is usually only reached with area-expensive dual-port memories. The third communication mode defines a lightweight message passing (MP) implementation that is truly connectionless. It is needed for efficient inter-process communication within the distributed and clustered scheduling system and for the tight coupling of the worker processing units. Hardware flow control ensures that an arbitrary number of senders can spontaneously start sending messages to the same receiver, while all messages are guaranteed to be received correctly, without connection establishment and with low message delay. The work focuses on hardware/software codesign optimization to increase the resource efficiency of dynamic SDF graph scheduling without compromise. Special attention is paid to the inter-level dependencies in developing a distributed scheduling system that relies on the availability of specific hardware-accelerated communication methods.
    Contents: 1 Introduction, 1.1 Motivation, 1.2 The Multiprocessor System on Chip Architecture, 1.3 Concrete MPSoC Architecture, 1.4 Representing LTE/5G Baseband Processing as Static Data Flow, 1.5 Computation Stack, 1.6 Performance Hotspots Addressed, 1.7 State of the Art, 1.8 Overview of the Work; 2 Hybrid SDF Execution, 2.1 Addressed Performance Hotspot, 2.2 State of the Art, 2.3 Static Data Flow Graphs, 2.4 Runtime Environment, 2.5 Overhead of Deploying Tasks to an MPSoC, 2.6 Interpretation of SDF Graphs as Task Graphs, 2.7 Interpreting SDF Graphs as Process Networks, 2.8 Hybrid Interpretation, 2.9 Graph Topology Considerations, 2.10 Theoretic Impact of Hybrid Interpretation, 2.11 Simulating Hybrid Execution, 2.12 Pipeline SDF Graph Example, 2.13 Random SDF Graphs, 2.14 LTE-like SDF Graph, 2.15 Key Learnings; 3 Distribution of Management, 3.1 Addressed Performance Hotspot, 3.2 State of the Art, 3.3 Revising Deployment Overhead, 3.4 Distribution of Overhead, 3.5 Impact of Management Distribution on Resource Utilization, 3.6 Reconfigurability, 3.7 Key Learnings; 4 Sliced FIFO Hardware, 4.1 Addressed Performance Hotspot, 4.2 State of the Art, 4.3 System Environment, 4.4 Sliced Windowed FIFO Buffer, 4.5 Single FIFO Evaluation, 4.6 Multiple FIFO Evaluation, 4.7 Hardware Implementation, 4.8 Key Learnings; 5 Message Passing Hardware, 5.1 Addressed Performance Hotspot, 5.2 State of the Art, 5.3 Message Passing Regarded as Queueing, 5.4 A Remote Direct Memory Access Based Implementation, 5.5 Hardware Implementation Concept, 5.6 Evaluation of Performance, 5.7 Key Learnings; 6 Summary
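    To make the hybrid interpretation concrete, the following minimal C sketch illustrates the per-actor decision the abstract describes: dispatch each SDF firing as its own dynamically scheduled task (more parallelism, more deployment overhead) or fuse the firings into a process-network-style stage (less overhead, less parallelism). All names and cost figures are illustrative assumptions, not values from the thesis.

        /*
         * Illustrative sketch only: choose, per SDF actor, between task-graph
         * style dispatch of individual firings and process-network style
         * streaming execution, based on whether the per-firing work amortizes
         * the assumed task deployment overhead.
         */
        #include <stdio.h>

        typedef struct {
            const char *name;
            int firings_per_iteration;  /* repetition count from the SDF balance equations */
            int cycles_per_firing;      /* estimated computation cost of one firing */
        } SdfActor;

        typedef enum { DISPATCH_AS_TASKS, RUN_AS_PROCESS } Mode;

        /* Assumed fixed cost of deploying one task on the MPSoC (illustrative). */
        #define TASK_DEPLOY_OVERHEAD_CYCLES 500

        static Mode choose_mode(const SdfActor *a)
        {
            if (a->cycles_per_firing > 4 * TASK_DEPLOY_OVERHEAD_CYCLES)
                return DISPATCH_AS_TASKS;   /* parallelizing individual firings pays off */
            return RUN_AS_PROCESS;          /* overhead would dominate: stream instead */
        }

        int main(void)
        {
            SdfActor pipeline[] = {
                { "fft",        4, 12000 },
                { "demap",      4,   800 },
                { "descramble", 4,   300 },
            };
            for (unsigned i = 0; i < sizeof pipeline / sizeof pipeline[0]; i++) {
                Mode m = choose_mode(&pipeline[i]);
                printf("%-11s -> %s\n", pipeline[i].name,
                       m == DISPATCH_AS_TASKS ? "task-graph firings" : "process-network stage");
            }
            return 0;
        }

    The point of the sketch is the trade-off itself: a cheap actor fired many times should not pay the deployment overhead per firing, while an expensive actor benefits from having its firings spread over several cores.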

    Towards carbon nanotube growth into superconducting microwave resonator geometries

    Full text link
    The in-place growth of suspended carbon nanotubes facilitates the observation of both unperturbed electronic transport spectra and high-Q vibrational modes. For complex structures that integrate, e.g., superconducting rf elements on-chip, selecting a chemically and physically resistant material that survives the chemical vapor deposition (CVD) process is a challenge. We demonstrate the implementation of molybdenum-rhenium coplanar waveguide resonators that exhibit clear resonant behaviour at cryogenic temperatures even after having been exposed to nanotube growth conditions. The properties of the MoRe devices before and after CVD are compared to those of a reference niobium device. Comment: 6 pages, 4 figures, IWEPNM conference proceedings

    The Orchestration Stack: The Impossible Task of Designing Software for Unknown Future Post-CMOS Hardware

    Get PDF
    Future systems based on post-CMOS technologies will be wildly heterogeneous, with properties largely unknown today. This paper presents our design of a new hardware/software stack that addresses the challenge of preparing software development for such systems. It combines well-understood technologies from different areas, e.g., networks-on-chip, capability operating systems, flexible programming models, and model checking. We describe our approach and provide details on its key technologies.

    Hardware Acceleration for RLNC: A Case Study Based on the Xtensa Processor with the Tensilica Instruction-Set Extension

    No full text
    Random linear network coding (RLNC) can greatly aid data transmission in lossy wireless networks. However, RLNC requires computationally complex matrix multiplications and inversions in finite fields (Galois fields). These computations are highly demanding for energy-constrained mobile devices. The presented case study evaluates hardware acceleration strategies for RLNC in the context of the Tensilica Xtensa LX5 processor with the Tensilica instruction-set extension (TIE). More specifically, we develop TIEs for multiply-accumulate (MAC) operations that accelerate matrix multiplications in Galois fields, single instruction multiple data (SIMD) instructions operating on consecutive memory locations, as well as the flexible-length instruction extension (FLIX). We evaluate the number of clock cycles required for RLNC encoding and decoding without and with the MAC, SIMD, and FLIX acceleration strategies. We also evaluate the RLNC encoding and decoding throughput and energy consumption for a range of RLNC generation and code word sizes. We find that for GF(2^8) and GF(2^16) RLNC encoding, the SIMD and FLIX acceleration strategies achieve speedups of approximately four hundred fold compared to a benchmark C code implementation without TIE. We also find that the single-core Xtensa LX5 with SIMD has seven to thirty times higher RLNC encoding and decoding throughput than the state-of-the-art ODROID XU3 system-on-chip (SoC) operating with a single core; the Xtensa LX5 with FLIX, in turn, increases the throughput by roughly 25% compared to utilizing only SIMD. Furthermore, the Xtensa LX5 with FLIX consumes roughly three orders of magnitude less energy than the ODROID XU3 SoC.
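    As a point of reference for what such MAC instructions accelerate, the following plain-C sketch shows the GF(2^8) multiply-accumulate loop at the core of RLNC encoding: each coded symbol is the XOR-sum of source symbols scaled by random coefficients in the finite field. The reduction polynomial 0x11D and all names are assumptions for illustration; the paper's custom TIE/SIMD/FLIX kernels target this kind of inner loop, but this is not their implementation.

        /* Reference (unaccelerated) GF(2^8) arithmetic for RLNC encoding. */
        #include <stdint.h>
        #include <stddef.h>

        /* Carry-less "Russian peasant" multiplication in GF(2^8),
         * reduced modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11D, assumed). */
        uint8_t gf256_mul(uint8_t a, uint8_t b)
        {
            uint8_t p = 0;
            for (int i = 0; i < 8; i++) {
                if (b & 1)
                    p ^= a;
                uint8_t carry = a & 0x80;
                a <<= 1;
                if (carry)
                    a ^= 0x1D;   /* fold the overflowed x^8 term back in */
                b >>= 1;
            }
            return p;
        }

        /* One coded symbol: coded[i] = XOR_j coeff[j] * source[j][i],
         * i.e. a multiply-accumulate over the whole generation. */
        void rlnc_encode_symbol(uint8_t *coded, const uint8_t *const *source,
                                const uint8_t *coeff, size_t generation,
                                size_t symbol_len)
        {
            for (size_t i = 0; i < symbol_len; i++) {
                uint8_t acc = 0;
                for (size_t j = 0; j < generation; j++)
                    acc ^= gf256_mul(coeff[j], source[j][i]);   /* MAC in GF(2^8) */
                coded[i] = acc;
            }
        }

    Decoding additionally requires Gaussian elimination over the same field, which is why the abstract mentions matrix inversions as a further cost driver.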

    A Hardware/Software Stack for Heterogeneous Systems

    Get PDF
    Plenty of novel emerging technologies are being proposed and evaluated today, mostly at the device and circuit levels. It is unclear what the impact of different new technologies at the system level will be. What is clear, however, is that new technologies will make their way into systems and will increase the already high complexity of heterogeneous parallel computing platforms, making them ever more difficult to program. This paper discusses a programming stack for heterogeneous systems that combines and adapts well-understood principles from different areas, including capability-based operating systems, adaptive application runtimes, dataflow programming models, and model checking. We argue why we think that these principles, built into the stack and the interfaces among its layers, will also be applicable to future systems that integrate heterogeneous technologies. The programming stack is evaluated on a tiled heterogeneous multicore.

    Zoonotic helminths affecting the human eye

    Get PDF
    Nowadays, zoonoses are an important cause of human parasitic diseases worldwide and a major threat to socio-economic development, mainly in developing countries. Importantly, zoonotic helminths that affect human eyes (HIE) may cause blindness with severe socio-economic consequences for human communities. These infections include nematodes, cestodes and trematodes, which may be transmitted by vectors (dirofilariasis, onchocerciasis, thelaziasis), by food consumption (sparganosis, trichinellosis), or acquired indirectly from the environment (ascariasis, echinococcosis, fascioliasis). Adult and/or larval stages of HIE may localize in human ocular tissues externally (i.e., lachrymal glands, eyelids, conjunctival sacs) or within the ocular globe (i.e., intravitreous retina, anterior and/or posterior chamber), causing symptoms due to the parasitic localization in the eyes or to the immune reaction they elicit in the host. Unfortunately, data on HIE are scant and mostly limited to case reports from different countries. The biology and epidemiology of the most frequently reported HIE are discussed, as well as clinical descriptions of the diseases, diagnostic considerations, and video clips on their presentation and surgical treatment. "Homines amplius oculis, quam auribus credunt" (Men believe their eyes more than their ears). Seneca, Ep. 6,5.